Tony Reina
4 JULY 2017
This is a basic example of linear regression with Intel Nervana's neon deep learning framework. It is based on this code.
This code shows that neon is not just for neural networks. It can handle all sorts of numerical computations and optimizations.
Linear regression is a common statistical method for fitting a line to data. It allows us to create a linear model so that we can predict outcomes based on new data.
We'll generate a simple line with some random noise and then use gradient descent to determine the parameters.
This also shows how to load custom data (e.g. user generated numpy arrays) into the neon DataIterator (ArrayIterator).
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This sets up either our GPU or CPU backend for neon. If we don't start with this, then ArrayIterator won't execute.
We're asking neon to use the CPU, but you can change that to a GPU if one is available. Batch size refers to how many data points are processed at a time. For example, here we want to compute the gradient over 2 data points at a time. Here's a primer on Gradient Descent.
Technical note: your batch size should always be much smaller than the number of points in your data. So if you have 50 points, set your batch size to something much less than 50.
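To make the idea of mini-batch gradient descent concrete before handing it to neon, here is a plain numpy sketch (not neon's implementation) that fits a noiseless line by updating the slope and intercept from 2 points at a time. The learning rate, epoch count, and variable names are illustrative choices, not values taken from neon.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 1))
y = 3.0 * X + 1.0  # noiseless line: slope 3, intercept 1

m_hat, b_hat = 0.0, 0.0   # parameters we will learn
lr, batch_size = 0.2, 2   # step size and points per gradient update

for epoch in range(500):
    order = rng.permutation(len(X))          # shuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        err = (m_hat * X[idx] + b_hat) - y[idx]
        # gradients of the mean squared error on this mini-batch
        m_hat -= lr * 2 * np.mean(err * X[idx])
        b_hat -= lr * 2 * np.mean(err)

print(round(m_hat, 3), round(b_hat, 3))  # close to 3.0 and 1.0
```

Each pass over the shuffled data makes 50 small corrections instead of one big one, which is exactly the trade-off the batch size controls.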
We can use ArrayIterator to generate datasets that neon understands from any custom data (e.g. Python lists, numpy arrays). By default it automatically turns the labels (y) into a one-hot encoding for classification problems. We'll override that because we want to do regression (i.e. predict continuous values).
In [183]:
import numpy as np
m = 123.45 # Slope of our line (weight)
b = -67.89 # Intercept of our line (bias)
numDataPoints = 100 # Let's just have 100 total data points
X = np.random.rand(numDataPoints, 1) # Let's generate a vector X with numDataPoints random numbers
noiseScale = 1.2 # The larger this value, the noisier the data.
trueLine = m*X + b # Let's generate a vector Y based on a linear model of X
y = trueLine + noiseScale * np.random.randn(numDataPoints, 1) # Let's add some noise so the line is more like real data.
In [193]:
from neon.data import ArrayIterator
from neon.backends import gen_backend
gen_backend(backend='cpu', batch_size=2) # Change to 'gpu' if you have GPU support
train = ArrayIterator(X=X, y=y, make_onehot=False)
In [194]:
import matplotlib.pyplot as plt
%matplotlib inline
In [195]:
plt.figure(figsize=(10,7))
plt.scatter(X, y, alpha=0.7, color='g')
plt.plot(X, trueLine, alpha=0.5, color='r')
plt.title('Raw data is a line with slope (m) of {} and intercept (b) of {}'.format(m, b), fontsize=14);
plt.grid('on');
plt.legend(['True line', 'Raw data'], fontsize=18);
In [196]:
from neon.initializers import Gaussian
from neon.optimizers import GradientDescentMomentum
from neon.layers import Linear, Bias
from neon.layers import GeneralizedCost
from neon.transforms import SumSquared
from neon.models import Model
from neon.callbacks.callbacks import Callbacks
In [197]:
init_norm = Gaussian(loc=0.0, scale=1)
In [198]:
layers = [Linear(1, init=init_norm),  # Linear layer with 1 unit
          Bias(init=init_norm)]       # Bias layer
model = Model(layers=layers)
In [199]:
# Loss function is the squared difference
cost = GeneralizedCost(costfunc=SumSquared())
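In plain numpy, a sum-of-squared-differences cost looks like the sketch below (up to a possible constant scaling factor, which neon's `SumSquared` may apply internally; the function name and example values here are illustrative).

```python
import numpy as np

def sum_squared(pred, target):
    # sum of squared differences between predictions and targets
    return np.sum((pred - target) ** 2)

pred = np.array([1.0, 2.0, 3.0])
target = np.array([1.0, 2.5, 2.0])
print(sum_squared(pred, target))  # 0.0 + 0.25 + 1.0 = 1.25
```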
All of our models will use gradient descent. We will iteratively update the model weights and biases in order to minimize the cost of the model.
In [200]:
optimizer = GradientDescentMomentum(0.1, momentum_coef=0.9)
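The momentum variant keeps a running "velocity" so that consistent gradient directions accelerate and noisy ones cancel out. A minimal numpy sketch of the classic update rule (my own illustration, not neon's internal code) with the same learning rate 0.1 and momentum 0.9 as above:

```python
def momentum_step(w, velocity, grad, lr=0.1, momentum=0.9):
    # velocity accumulates a decaying sum of past gradients,
    # then the parameter moves along the velocity
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# one step on f(w) = w**2, whose gradient is 2*w
w, v = 5.0, 0.0
w, v = momentum_step(w, v, grad=2 * w)
print(w)  # 5.0 - 0.1 * 10.0 = 4.0
```

On the first step the velocity term is zero, so this is an ordinary gradient step; momentum only changes later updates.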
In [201]:
# Execute the model
model.fit(train,
optimizer=optimizer,
num_epochs=11,
cost=cost,
callbacks=Callbacks(model))
In [202]:
# Print the fitted weights
slope = model.get_description(True)['model']['config']['layers'][0]['params']['W'][0][0]
print("calculated slope = {:.3f}, true slope = {:.3f}".format(slope, m))
bias_weight = model.get_description(True)['model']['config']['layers'][1]['params']['W'][0][0]
print("calculated bias = {:.3f}, true bias = {:.3f}".format(bias_weight, b))
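As a sanity check on the gradient-descent estimates, ordinary least squares has a closed-form solution that numpy can compute directly. This sketch regenerates data with the same slope, intercept, and noise scale as above (with its own random draw, so the numbers won't match the neon run exactly) and solves for the line with `np.linalg.lstsq`:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 1))
y = 123.45 * X - 67.89 + 1.2 * rng.standard_normal((100, 1))

# design matrix with a column of ones so lstsq also fits the intercept
A = np.hstack([X, np.ones_like(X)])
(slope_ols, bias_ols), *_ = np.linalg.lstsq(A, y, rcond=None)
print(slope_ols[0], bias_ols[0])  # near 123.45 and -67.89
```

If gradient descent has converged, its slope and bias should land very close to this closed-form answer.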
In [203]:
plt.figure(figsize=(10,7))
plt.plot(X, slope*X+bias_weight, alpha=0.5, color='b', marker='^')
plt.scatter(X, y, alpha=0.7, color='g')
plt.plot(X, trueLine, '--', alpha=0.5, color='r')
plt.title('How close is our predicted model?', fontsize=18);
plt.grid('on');
plt.legend(['Predicted Line', 'True line', 'Raw Data'], fontsize=18);